Quality report of the IPBES IAS bibliography

Bibliography

Bibliography Setup

The bibliography is loaded and the DOIs, ISBNs and ISSNs are extracted.
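
The extraction step could look roughly like the following minimal sketch. It assumes a hypothetical character vector of free-text fields and uses a common heuristic DOI pattern (the prefix `10.`, a registrant code of 4 to 9 digits, a slash, then a suffix). The function name `extract_dois` and the sample input are illustrative only and are not the report's actual implementation.

```r
# Hypothetical helper: pull the first DOI-like string out of each element.
# The pattern is a common heuristic, not an exhaustive DOI grammar.
extract_dois <- function(x) {
    pattern <- "10\\.[0-9]{4,9}/[^[:space:]\"<>]+"
    m <- regexpr(pattern, x)
    found <- !is.na(m) & m > -1L
    out <- rep(NA_character_, length(x))
    out[found] <- regmatches(x, m)
    # DOIs are case-insensitive, so normalise to lower case before comparing
    tolower(out)
}

# Illustrative input, not taken from the bibliography export
x <- c(
    "https://doi.org/10.1126/science.aaa1788",
    "doi:10.1038/NATURE01833",
    "no identifier here"
)
extract_dois(x)
```

Elements without a DOI-like string come back as `NA`, which keeps the result aligned with the input.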

Introduction

This report assesses the following with regard to the provided bibliography, named bibliography:

Remarks:

  • The group ID is in the JSON export but not in the CSV export. The group ID makes it possible to jump directly to the reference in the online Zotero library.

Data Quality of the Bibliography

Cleanliness of bibliography

One measure of the cleanliness of a bibliography is the share of references that have a DOI. This section gives an overview of the DOIs, ISBNs and ISSNs in the bibliography.

Entries with DOIs, ISBNs or ISSNs

The DOI is the most widely used identifier for a reference. The following list shows, for each identifier type, the number of references that have one and the number of duplicates.

Treating duplicate ISBNs or ISSNs as duplicate entries in the library is not warranted: different chapters of a book, for example, can be separate entries in the library and therefore share the same ISBN.

  • DOIs: 611 (40.7%) - 81 duplicates
  • ISBNs: 115 (7.7%) - 119 duplicates
  • ISSNs: 563 (37.5%) - 298 duplicates
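
Figures of this kind can be derived as in the following minimal sketch. It assumes one identifier per reference, with `NA` where none was found; the helper name `summarise_ids` and the sample vector are hypothetical, and the report's own counting may differ.

```r
# Hypothetical input: one identifier per reference, NA where none was found
ids <- c("10.1/a", "10.1/b", "10.1/a", NA, NA)

# Count identifiers, their share of all references, and repeated values
summarise_ids <- function(ids) {
    present <- ids[!is.na(ids)]
    list(
        n = length(present),
        percent = round(100 * length(present) / length(ids), 1),
        duplicates = sum(duplicated(present))
    )
}

str(summarise_ids(ids))
```

Dropping the `NA` values before calling `duplicated()` matters here, because repeated `NA`s would otherwise be counted as duplicates.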

The following DOIs are duplicates in the bibliography. This table should be empty.

Show the code
# DOIs that occur more than once (every repeat after the first occurrence)
duplicate_dois <- bibliography$dois[duplicated(bibliography$dois)]

if (length(duplicate_dois) > 0) {
    data <- data.frame(
        Type = "doi",
        Identifier = sprintf(
            '<a href="https://doi.org/%s" target="_blank">%s</a>',
            duplicate_dois, duplicate_dois
        )
    )
} else {
    # Placeholder row so that the table still renders without duplicates
    data <- data.frame(
        Type = "doi",
        Identifier = NA
    )
}

data |>
    knitr::kable(
        caption = "Duplicate DOIs in the Bibliography",
        escape = FALSE
    )
Duplicate DOIs in the Bibliography
Type Identifier
doi 10.1146/annurev-environ-042911-093511
doi 10.1126/science.aaa1788
doi 10.1073/pnas.1211466110/-/DCSupplemental.www.pnas.org/cgi/doi/10.1073/pnas.1211466110
doi 10.1016/j.jhydrol.2011.04.023
doi 10.1080/21513732.2011.634436
doi 10.1007/s10531-015-1019-0
doi 10.1073/pnas.0703873104
doi 10.1080/21513732.2011.634436
doi 10.1016/j.jhydrol.2014.01.055
doi 10.1073/pnas.1013100108
doi 10.1016/j.ecolind.2014.05.023
doi 10.1016/j.cosust.2009.07.006
doi 10.1111/j.1365-294X.2012.05461.x
doi 10.1016/S0020-7519(01)00203-X
doi 10.1016/S0140-6736(12)61678-X
doi 10.1111/j.1469-0691.2008.02691.x
doi 10.1175/JCLI-D-12-00302.1
doi 10.1016/j.gloenvcha.2008.02.001
doi 10.1579/0044-7447(2007)36[614:TAAHNO]2.0.CO;2
doi 10.1525/bio.2009.59.11.6
doi 10.1525/bio.2009.59.11.6
doi 10.1016/S0020-7519(98)00056-3
doi 10.1016/j.ecolind.2014.05.023
doi 10.1016/j.crm.2014.08.002
doi 10.1016/j.ecolind.2014.01.025
doi 10.1016/j.jenvman.2014.09.026
doi 10.1016/j.gloenvcha.2014.08.007
doi 10.1146/annurev-environ-042911-093511
doi 10.1038/nature01833
doi 10.1016/j.gloenvcha.2011.08.005
doi 10.1007/s11069-011-0064-6
doi 10.1016/j.cub.2007.11.054
doi 10.1111/fme.12133
doi 10.1111/j.1467-9493.2008.00343.x
doi 10.1353/lde.2012.0019
doi 10.1016/j.envdev.2015.09.007
doi 10.1016/j.ecolecon.2007.09.018
doi 10.1007/s10750-009-9979-2
doi 10.1146/annurev.polisci.2.1.493
doi 10.1016/j.icesjms.2004.12.006
doi 10.1016/j.gloenvcha.2014.03.001
doi 10.1016/j.jenvman.2010.03.023
doi 10.1017/S1355770X12000551
doi 10.1080/14728028.2014.993431
doi 10.1002/pad.259
doi 10.1111/j.1475-4959.2011.00432.x
doi 10.1111/j.1523-1739.2008.00970.x
doi 10.1080/09500690902981269
doi 10.1017/S0376892910000366
doi 10.1016/j.ecolecon.2016.03.018
doi 10.1073/pnas.1302251110
doi 10.1016/j.envsci.2009.04.002
doi 10.1016/j.envhaz.2006.11.001
doi 10.1111/j.1472-4642.2011.00770.x
doi 10.1111/j.1523-1739.2009.01243.x
doi 10.1080/13504500609469666
doi 10.3390/atmos4040383
doi 10.1016/j.icesjms.2004.12.003
doi 10.3389/fevo.2015.00137
doi 10.1016/j.envdev.2015.06.006
doi 10.1016/j.envsci.2014.10.011
doi 10.1007/s10113-013-0533-4
doi 10.1080/02673843.2012.657657
doi 10.1080/03736245.2017.1299639
doi 10.1016/S0016-7185(01)00027-6
doi 10.1146/annurev.energy.30.050504.144511
doi 10.1016/j.ecolecon.2011.11.012
doi 10.3390/su6107142
doi 10.1016/j.ijdrr.2015.01.007
doi 10.1080/21513732.2011.617711
doi 10.1111/j.1468-0491.2008.00402.x
doi 10.1016/j.cosust.2014.11.002
doi 10.1146/annurev.energy.30.050504.144511
doi 10.1016/j.gloenvcha.2014.03.001
doi 8
doi 10.1016/j.gloenvcha.2011.08.002
doi 10.1016/j.cosust.2013.07.002
doi 10.1016/j.cosust.2014.11.002
doi 10.1016/j.cosust.2015.02.006
doi 10.1111/j.1475-2743.2008.00169.x
doi 10.1111/j.1475-2743.2008.00169.x
Show the code
rm(data)

DOIs in OpenAlex

To validate the existence and validity of the DOIs, we check if the DOIs are in the OpenAlex database.

Of the 530 unique DOIs in the library, 89 (16.8%) are not in OpenAlex.

Show the code
data.frame(
    Type = "doi",
    Identifier = sprintf('<a href="https://doi.org/%s" target="_blank">%s</a>', metrics$dois_not_in_oa, metrics$dois_not_in_oa)
) |>
    IPBES.R::table_dt(caption = "dois_not_in_oa")

Of these, the following are not valid DOIs:

Show the code
# DOIs missing from OpenAlex that also fail the validity check
dois_not_valid <- metrics$dois_not_in_oa[!(metrics$dois_not_in_oa %in% metrics$dois_valid)]

data.frame(
    Type = "doi",
    Identifier = sprintf('<a href="https://doi.org/%s" target="_blank">%s</a>', dois_not_valid, dois_not_valid)
) |>
    knitr::kable(
        caption = "Non-valid DOIs in the Bibliography",
        escape = FALSE
    )
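
The number of non-valid DOIs among those missing from OpenAlex can be counted directly from set membership. By construction this keeps the result between zero and the number of DOIs missing from OpenAlex, whereas subtracting the lengths of two vectors can go negative. The vectors below are hypothetical stand-ins for `metrics$dois_not_in_oa` and `metrics$dois_valid`, and the sketch assumes that both hold DOI strings.

```r
# Hypothetical stand-ins for the report's metrics
dois_not_in_oa <- c("10.1/a", "10.1/b", "8", "not-a-doi")
dois_valid <- c("10.1/a", "10.1/b", "10.1/c")

# Keep only the DOIs missing from OpenAlex that are not in the valid set
not_valid <- dois_not_in_oa[!(dois_not_in_oa %in% dois_valid)]

# The count is the length of that subset, never a difference of lengths
length(not_valid)
```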

TODO Finally, we check whether these DOIs exist but have not been ingested into OpenAlex. This is done using the doi.org resolver. This check is disabled at the moment.
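
A resolver check of this kind might be sketched as follows, using only base R. It is not the implementation behind `IPBES.R::doi_exists`. It assumes that doi.org answers a registered DOI with a redirect (a 3xx status) and an unregistered one with an error status; the helper names `doi_url` and `doi_resolves` are hypothetical.

```r
# Build the resolver URL for a DOI, percent-encoding unsafe characters
doi_url <- function(doi) {
    paste0("https://doi.org/", utils::URLencode(doi))
}

# Hypothetical check: TRUE if doi.org answers with a redirect (3xx),
# FALSE for any other status, NA if the request itself fails (e.g. no network)
doi_resolves <- function(doi) {
    tryCatch(
        {
            headers <- curlGetHeaders(doi_url(doi), redirect = FALSE)
            status <- attr(headers, "status")
            status >= 300 && status < 400
        },
        error = function(e) NA
    )
}
```

Calling `doi_resolves("10.1126/science.aaa1788")` needs network access, so results are best cached, as the report does with its `cache_file` arguments.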

Show the code
to_check <- bibliography$dois[!(bibliography$dois %in% dois_works)]

dois_valid <- IPBES.R::doi_valid(bibliography$dois)
dois_openalex <- bibliography$dois %in% dois_works
names(dois_openalex) <- bibliography$dois

dois_exist <- IPBES.R::doi_exists(to_check, cache_file = file.path(".", "cache", "doi_exist.rds"))
dois_not_retracted <- IPBES.R::doi_not_retracted(bibliography$dois, cache_file = file.path(".", "cache", "doi_not_retracted.rds"))

# Number and share of duplicated DOIs, computed on the same vector
n_dois <- length(bibliography$dois)
n_duplicate_dois <- n_dois - length(unique(bibliography$dois))

sprintf(
    fmt = paste(
        "Number of references: \t\t %d",
        "Number of DOIs: \t\t %d",
        "Number of Duplicate DOIs: \t %d",
        "Number of DOIs in OpenAlex: \t %d ( %f %%)",
        "Number of Existing DOIs: \t %d",
        "Number of Retracted DOIs: \t %d",
        "Percentage of Duplicate DOIs: \t %f",
        sep = "\n"
    ),
    nrow(bibliography),
    sum(!is.na(bibliography$dois)),
    n_duplicate_dois,
    sum(dois_openalex), 100 * sum(dois_openalex) / nrow(bibliography),
    sum(dois_exist),
    sum(!dois_not_retracted),
    round(100 * n_duplicate_dois / n_dois, digits = 1)
) |> cat()
Show the code
oldopts <- options(knitr.kable.NA = "")
data.frame(
    Metric = c(
        "# References",
        "**DOI**",
        "# DOIs",
        "# Duplicate DOIs",
        "# Existing DOIs",
        "# Retracted DOIs",
        "% Duplicate DOIs",
        "**ISBN**",
        "# ISBNs",
        "# Duplicate ISBNs",
        "**ISSN**",
        "# ISSNs",
        "# Duplicate ISSNs"
    ),
    Value = c(
        nrow(bibliography),
        NA,
        sum(!is.na(bibliography$dois)),
        length(bibliography$dois) - length(unique((bibliography$dois))),
        sum(dois_exist),
        sum(!dois_not_retracted),
        round(100 * (length(bibliography$dois) - length(unique(bibliography$dois))) / length(bibliography$dois), digits = 1),
        NA,
        sum(!is.na(bibliography$isbns)),
        length(bibliography$isbns) - length(unique((bibliography$isbns))),
        NA,
        sum(!is.na(bibliography$issns)),
        length(bibliography$issns) - length(unique((bibliography$issns)))
    )
) |>
    knitr::kable(
        caption = "Cleanliness of the Bibliography"
    )
options(oldopts)

Content and Bibliographic Analysis

Publication types

Show the code
bibliography$bibliography |>
    dplyr::group_by(
        Item.Type
    ) |>
    dplyr::summarize(
        count = dplyr::n()
    ) |>
    dplyr::arrange(
        dplyr::desc(count)
    ) |>
    knitr::kable()
Item.Type count
journalArticle 1207
book 144
report 112
bookSection 14
conferencePaper 9
thesis 9
manuscript 7

Year of Publication

Show the code
params$figure_pub_year
NULL

Access Status of References

The access status is taken from the works retrieved from OpenAlex and is therefore limited to works that are in OpenAlex. At the moment, only references with a DOI are retrieved from OpenAlex.

Show the code
params$figure_oa_status
NULL

50 Most Frequently Cited Journals

Show the code
params$figure_top_journals
NULL

This table contains all journals as specified in the Zotero database.

Show the code
params$figure_top_journals_data |>
    IPBES.R::table_dt("cited_journals")

TODO Countries of Institutes of all authors

This plot only contains the countries with more than 10 references.

Show the code
#|
#| fig-height: 10
#| fig-width: 10

params$figure_top_countries

This table contains all countries and the number of authorships for each.

Show the code
params$figure_top_country_data |>
    IPBES.R::table_dt("top_countries")